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Abstract: Algorithms for the automatic evaluation of elementary functions 

were studied. Available algorithms obtained from current literature were 
analyzed to determine their suitability for hardware implementation, in 
terms of their accuracy, convergence rate, and hardware requirements. The 
functions considered were quotient, arctangent, cosine/sine, exponential, 
power function, logarithm, tangent, square root, and product. 
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1.0 Introduction 



The purpose of this study was to identify and analyze mathematical 
algorithms for possible hardware implementation. Emphasis was placed on 
the evaluation of elementary functions, such as square root, logarithm and 
exponential, trigonometric functions, and multiply and divide. The study 
was directed toward algorithms for binary computers, although some references 
are included which address themselves to radices 10 [56], 16 [18], and -2 
[53], [54], [65]. 

We have not assumed that any fixed degree of accuracy was required. 
Rather, we have generally concentrated on methods which are flexible enough 
for the accuracy to be a function of the number of iterations performed. 

We have generally assumed that extremely low precision is not sufficient, 
thus ruling out consideration of methods which are essentially table-look-up, 
with or without interpolation (e.g., [23], [52], [57]). Low precision inter- 
polation, such as Mitchell [43], and variations of it ([11], [27], [39]) 
have not been studied. 

The commitment of a large amount of hardware can sometimes be used to 
decrease the execution time for evaluation of functions. The use of cellular 
arrays has been proposed for multiplication [13], division [37], [62], [3], 
square root [9], [22], [38], and logarithm [15]. While this idea is promising 
in terms of speed, we have concentrated on algorithms to be implemented with 
the use of only a moderate amount of parallel processing. 

A number of the proposed algorithms are unified in the sense that with 
variations in certain parameters the same procedure can be used to evaluate 
any one of several functions. A description of three algorithms of this type 
will be given in Section 3, before proceeding with a discussion of algorithms 
for specific functions in Section 4. A short consideration of error analysis 



is given in Section 2. 



2.0 Error Analysis Considerations 



The error involved in evaluating a function consists of three parts: 

(i) Roundoff error is accumulated in doing the necessary arithmetic; (ii) 

The methods are interative, and convergence is obtained only to within a 
specified tolerance, thus truncation error occurs; and (iii) If the inter- 
mediate results are carried to additional bits of accuracy to reduce roundoff 
error accumulation, an error is committed by reducing the number of bits in 
the final result to machine precision. 

We will generally assume that we are dealing with fractional numbers 
which involve N fraction bits. This is compatible with the notion of 
floating point arithmetic, where the value of the exponent, or characteristic, 
is taken care of by a separate normalization procedure. This may be performed 
before or after the algorithm we discuss. More is said about this in 
discussing the various functions. 

Suppose that M operations where roundoff error may occur are performed. 

If the error committed each time is bounded by e, the total error can not 

exceed Me. Assuming N fraction bits, we would probably want the roundoff 

-N-1 -N-1 

error to be bounded by 2 , or Me < 2 . Thus - > N + 1 + log^M. 

Roundoff error is decreased by increasing the precision of the intermediate 
results; say, use J "guard” bits for a total of N + J fraction bits. Then 



( 1 ) 



e = K • 2 



-N-J-1 



> 



where 




rounding 
chopping * 



Thus, (1) in the above, yields - ^^^2^ + N + J + 1>N + 1 + log^, 



*The term chopping will refer to simply truncating after N + J bits without 
regard to the N + J + ISS- bit. The term truncation error will be defined later. 



or 



( 2 ) 



J > 



i 



log^M for rounding 
log^M + 1 for chopping • 



The truncation error in these algorithms occurs when a function g(x) is 

to be evaluated at a point a, and actual evaluation point is a - y. This 

error is the result of truncating an infinite process after a finite number 

of steps. If g is differentiable, which it is in our case, the difference 

in the function values is g(a) - g(a ~ y) = g’ (oi*)y, where a* is between 

a and a - y. The number y is related to the convergence criterion used in 

-N-1 

the algorithm, and in most cases it will be bounded by 2 , the same as the 

roundoff error bound we will assume. 

The choice of bound for truncation error and roundoff error should be 

undertaken together, since it does not make good sense to choose either so 

that the truncation or roundoff error bounds differ significantly from each 

other. That is, it would not be meaningful to do enough iterations in a 

-20 

calculation to make the truncation error as small as 2 if the roundoff 

error could be as big as 2 Conversely, it is wasteful to hold roundoff 

error to 2 if one is trying to obtain accuracy to 2 

In line with the above proposed selection of J, the truncation error 

.-N-1 



bound should be made about 2 



.-N 



also. The total error could then be as 



large as 2 , in the calculated value. It is then necessary to chop or 

round this result to N bits, which introduces an additional error of at most 
-N -N-1 

2 or 2 , respectively. The final value could then be in error by as 

-N -N 

much as 2*2 or 3/2*2 . These are bounds , and in the usual case the error 
will be smaller. One should keep in mind that they are sharp bounds, in 
the sense they may be approached closely in a given case. 
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It appears that it would certainly be worthwhile to round to N bits 

after the final iteration, since this procedure gives a significantly smaller 

total error bound than that for chopping. If the error bound given previously 

for the results of the iterations (before the final round) are not sufficiently 

accurate, the bound on this part of the calculation can be reduced to 2 ^ \rom 
”N 

2 by the performance of one additional iteration, using one more guard bit. 

-N“l 

The total error bound can never be made smaller than 2 , since that is the 

bound for the error in the final rounding operation. 

The error bounds given in Section 4 will be for the N -h J bit result, 
before the final round to N bits, the added error bound for the final rounding 
being the same for all cases where J > 0. 

3.0 Unified Algorithms 

Algorithms which, with small modifications, can evaluate one of several 
functions are basically of three types. One is formulated as a coordinate 
rotation problem, another is formulated as a ”pseudo-division/multiplication’’ 
process, while a third type is a normalization procedure. We will discuss 
each of them in this section. 

3.1 Coordinate Rotation Methods 

The Coordinate Rotation Digital Computer was first discussed by Voider 
[64], who considered rotations in the usual circular coordinate system. It 
was indicated that work was also done in hyperbolic and linear coordinate 
systems, but this was not reported in detail. An earlier report by Voider 
[63], was not available. Liccardo [31] did a master’s thesis on the CORDIC 
methods, and included the hyperbolic system. He also outlined procedures 
for multiply and divide. Linhardt and Miller [34] included the details of 
the hyperbolic system, but not the linear system. Walther [66] presented 
the unified algorithm and his paper tied together the rotations in the three 
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different coordinate systems. A paper by Perle [47] also appeared, which 



fit the pattern, in that it does not generalize to hyperbolic and linear 

systems. Very recently, Schmid and Bogacki [56] discuss the implementation 

of the CORDIC algorithm in radix 10. 

Basically, the CORDIC method involves taking a sequence of rotations 

(with radial distortion) of an initially rotated coordinate system. The 

initial angle of rotation is = z. A point “ (x,y) is specified 

by giving its coordinates with respect to the rotated coordinate system. 

We generate the following sequence of points by the rotations; (x^,y^), 

(x. ,y, ),..., (x,y ), where 
11 n n 



Here m is parameter indicating the type of coordinates for the rotations 



included a method of obtaining sin ^x. The algorithm for sin ^x does not 




(3) 




- 6 . s .X . 
Ill 



(1, 0, -1 for circular, linear, hyperbolic, respectively), s^ = ± 1 determines 
the direction of rotation, and the 6^ are specified constants. The angles 
z^ of the rotated coordinate system, and radii R^ of the radius vectors to 
the (x^,y^) are given by 




(4) 



a . 

1 



m tan (m 6^) , and 



R = (x 



o 
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is specified, as indicated previously. We see that each is the 

location of the (radially distorted) point (x ,y ), with respect to a 

o o 

coordinate system rotated through angle z^. Also note that 

z = z + a 
n o 

R = R K , where 
n 0 m 

(5) n-1 

a = y s . a . , and 
11 

1=0 



n-1 



K 

m 



= .n 

1=0 



9 

(1 + ^ 



are independent of the (x,,y^), except insofar as they may influence the s^. 

In terms of the initial values x and y , it can be shown that 

o o 



X = K {x cos (am^) + y raisin (am^)} 
n m o o 

( 6 ) 

% — ^ ^ 

y = K iy cos (am ) - x m sin(om^)} 
n m o o 



It is necessary that the initial values of x y and z be restricted, 

o o o 

both for purposes of representing them in the computer, and to guarantee 

convergence of the algorithm. We say more about this later. With the 

appropriate restrictions on judicious choice of the s^, 

the values of x ^y ^ and z can be forced to approach that values indicated 
n n n 

in Table 1. The terms rotation mode (s.’s chosen to force z to zero) and 

1 n 

vectoring mode (s^’s chosen to force y^ to zero) were coined by Voider and 
are descriptive. The choice of s^ is as follows. In rotation mode 






-1 



if z . < 0 
1 



if z . > 0. 

X 
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In the vectoring mode 



s 



i 




If y . > 0 

if y. < 0 

^ 1 



The latter is dependent on being non-negative* This is easily adjusted 

for in cases where is negative, as we will note in the pertinent sections. 

The choice of 6^ is critical to the entire procedure and in order to 

facilitate computation, we want each 6^ to be a power of two. Thus the 

transformations (in a binary computer) are accomplished by shifting and 

adding. In order that the radial distortion constant, K , be independent 

m 

of the input data, it is necessary that the magnitudes of the 6^ be indepen- 
dent of the input data. Thus the same sequence of rotations must always be done, 
except that the direction of rotation may vary. Walther gives the choice of 6^ 

given in Table 2. We also list the maximum value of a obtainable with this 

sequence, as well as the corresponding value of and the number of iterations, 

-N-1 

N , required so that - 2 
m ^ N 

m 

We will see that with appropriate range reduction techniques, discussed 
for the individual functions in the next section, that the sequences given 
in Table 2 are adequate. 

Because the last rotation of magnitude must be accomplished, the 

m 

value of can be as large as , in the rotation mode, and the value 

m m 

of can be as large as 6^ in the vectoring mode. The truncation 

m mm 

error for the CORDIC algorithm is based on , then. 

m 
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Rotation mode 



m 



lim X 
n-x» n 



lim y 

n 



lim z 

n 



K 



cos z - y sin z) ^]^^y z + x sin z) 
X y + x*z 



-1 cosh z + y sinh z) ^_^(y cosh z + x sinh z) 0 



Vectoring mode 



1 K^(x^ + 



z + tan"\/x 



X 



z + y/x 



■1 K_^(x^ - 



z + tanh"^y/x 



Table 1 



m 


sequence (6^ = 1 2 ^i^ 


Max a 


K 

m 




N 

m 


1 


0yl^2^ • • • ^ n 


-1.74 


-1.65 


N 


+ 2 


0 


n 


1.0 


1.0 


N 


+ 1 


1 


1,2, 3, 4, 4, •••» n 


-1.13 


o 

00 


N 


+ 1 + R* 



*Repeated values are 4,13,...,k, 3k + 1, R denotes the number of repeated 
values, e.g., if N= 16, R= 2. 



Table 2 
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~h “1 / *5 

The values of K and a, = m tan (ra 6.), i = 1,2,..., N , must be 
ml 1 m 

available, and as with most algorithms in this class, it is assumed these 
will be stored in a read-only-memory. Note that in practice, for the 
algorithm to operate for all three values of m, values of the for all 
three m, (tan ^2 2 and tanh ^2 would need to be stored. 

The greatest asset of the CORDIC algorithm is its versatility. It is, 
in fact, even more versatile than would appear at first glance, since functions 
related to those of Table 1 can be evaluated by pre - or post - manipulation 
of the data. For example, may be obtained by setting x=w + ^^y =»w-l^^ 
and entering the algorithm in the vectoring mode with m = - 1, followed by 
a division by K 

The main disadvantage of the method is that it requires a fixed number 

of iterations (rotations), whether needed or not, to avoid changing the 

constant K . DeLugish [16] gives a partial solution to the problem, which 
m 

we will discuss later. Also, the CORDIC algorithm could be used in part, 
as we note later, in Section 4.3.3, such that the radial distortion constant 
is eliminated. 

3.1.1 Error Reduction for Certain Functions 

For functions which are zero when the argument is zero, special pro- 
cedures must be used to obtain results which are accurate to N significant 
digits. Generally the normalization procedures handle this automatically, 
but we will discuss briefly some functions where this is not the case. 

When the sine, tangent, or arctangent of a small argument, in floating 
point form, is desired, we would like the result to be accurate to as many 
significant digits as possible. The procedures outlined in Section 4 do not 
result in this desired accuracy. We will note in Section 4 that all the 
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algorithms for these functions reduce to the CORDIC algorithm, essentially, 
and Walther [66] has treated this problem. The solution is to scale the 
argument by an appropriate power of two, beginning the rotations with an 
appropriately small angle and continuing for the usual number of rotations. 
This requires the inclusion of a set of distortion constants, depending on 
the initial angle of rotation, as well as additional values for the 
The details of the implementation can be found in Walther. The above should 
also be done for the hyperbolic system if the hyperbolic sine, tangent, and 
arctangent are desired. If the hyperbolic system is to be used to calculate 
the exponential, logorithm, and square root, the increased accuracy there 
is more apparent than real, since the initial or final transformation negates 
the increased accuracy. 



3.2 Pseudo-Division/Multiplication Methods 

Consider the division of y by x using the method of successive sub- 
tractions. This can be organized (in binary) as follows. Let y and x be 
normalized so that < x,y < 1. Then H < y/x < 2. Let the quotient be 



n 



represented by the number where each q^ is 0 or 1. The q^ 

i=0 

are determined by the following loop. Let = y , d^ = x, Qq ” 

For i = 0,1,..., n, let 



Then D 
d 
Q 



i+1 

i+1 

i+1 



- 2(D^ - 




'' 0 if D. ^ d. , or 

1 1 

* 

1 if D. < , 

1 i 
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n-1 



-1 



Then 



For multiplication, the procedure is reversed. Let x = ^ x,2 

i=0 ^ 

the product xv = is generated byP^ = 0, = y, and for 

1 0,1,«*«, n ” 1, 

"i+i ■ 20>i + 

>'i+i ' ’'i 

The product is assumed to be accumulated in a register of length 2n+l bits. 

At the end of the multiplication the product has been shifted n+1 bits, 

into the most significant n bits of the register, and chopping (or rounding) 

to the most significant n bits gives the result. Error occurs only when 

the least significant half is chopped or rounded off. 

A pseudo-division or multiplication is a procedure like one of the above, 

where the q.’s or x.’s are not necessarily the coefficients in radix 2, and 
11 

the divisior d^, or multiplicand y^, may be modified at each iteration. 

Meggitt [40] devised a class of these methods for division/multiplication, 

logarithm/exponential, tangent /arctangent , and square root. While Meggitt ’s 

paper was developed for radix 10, the conversion to any other radix is simple. 

Meggitt ’s algorithms correspond to a restoring division. The test for 

Df ^ d^ would be done by computing a tentative value of % ^ ^i+1^ ” 

D. - d., then testing % sign. If % is non-negative, q, = 1, 

1 i i"rl i+1 1 

and = 2(h If h is negative, = 0 and = 

2(% + d.). That is, we must "restore" the value of D. in the latter 

1+1 1 1 

case . 



It is not necessary to restore the value of D^, however, since it is 
possible to compute % directly from Suppose q^ = 0, then we 
have 
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Depending on , this expression may or may not be easier to compute 



it is difinitely easier to use the above, non-restoring division. We then 



non-negative or negative, respectively. 

Sarkar and Krishnaraurthy [55] modified Meggitt’s algorithms to 
correspond to a non-restoring pseudo-division/multiplication. This results 
in a faster algorithm. It would appear this is more advantageous in radix 
10 than in radix 2. The above mentioned paper incorporated a possibility 
of restoration or non-restoration, depending on which appeared to be more 
advantageous, i.e., which left the smaller dividend. 

We list in Tables 3 and 4 the initialization and iteration equations 
for Meggitt’s algorithms in radix 2. In cases where it is assumed a number 
is expressed in a variable radix (log(l + 2 or tan ^ 2 ^), this must be 
first obtained by a modified division. The procedure given by Meggitt is 
as follows, where D is to be recoded in one of the above radices. 



than by first restoring D.. In normal division, d.,- = d,, so in that case 

1 1+1 i 





+ d. if q. =0, and that q, is 1 or 0 as is 

11 ^i i 



D = D 



o 
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For i=0, n-1 

= 2^ log(l + 2“^) , [or : 2^ tan"^ 2"^] 






J 

1 if D. ^ d, 

1 i 

< 

0 if D. < d_, 

1 i 



Vl ■ '('’i - Ol-*!' 



At the end of the loop, we have 



n-1 n-1 

D = \ q log(l + 2 ^) , [or: I q tan 2*^] , 

i=0 i=0 ^ 

with remainder D /2^. 

n 

The advantage of the pseudo-division/multiplication processes is that 
they look very much like multiplication and division, and could be imple- 
mented with little additional hardware. The routines are inherently accu- 
rate and insensitive to roundoff error, and accuracy comparable to other 
methods can be obtained with only one guard bit. The use of one double 
length register is necessary, although with some reformulation this might 
be replaced by one having only a sufficient number of guard bits, about 
log^n+l* 

The evaluation of some functions require a modified division to recode 
an argument, followed by a psuedo-multiplication, or a pseudo-division, 
followed by a modified multiplication, and as outlined, some of these 
must be done serially, not in parallel. 
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log(l+y/x) tan y/x 



I? 



X 




rH 

<U 



4J 

O 

0) 

CU 

W 

Q) 

U 



•H 

X 

V 



[S3 

U 

o 

•H 

X 

A1 



•r 

N 
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U 

o 

o 

II 

•r 
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0) 

4-> 

o 
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•H 


+ 


•H 
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•H 
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cr 
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T3 
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CO 




CO 
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Table 3: Meggitt Pseudo-division 



I 

*H 

I 

CNJ 

X 

+ 



<N 



CL 



•H 
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CO 

H 
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CL 
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+ 

CL 
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CL 

•H 



X 

I 

O 

:3 

<D 

CO 

P-. 



u 

XJ 

0 

CL 

1 

o 

XJ 

3 

0) 

CO 

CLi 



X) 

G 

CO 

o 

•H 

I rH 
O CL 
XJ *H 
CJ -u 
<D rH 
CO G 
CL, 
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Table 4: Meggitt Pseudo-Multiplication 



3.3 Normalization Methods 



Consider a function of two variables which is to be subjected to a 
series of transformations under which the function value is to be invariant. 
The first variable will be transformed to a specified value while the second 
is altered in such a way that its value is forced to approach the desired 
function value. 

For example, consider the function y + log x. Let x = x, y = y, and 

o o 

define the sequence (x^,v^) by ^i+1 ^ ^i ^i* ^^ere the 

a^ are positive constants. 

Then we have 

y + log X = y + log X = y + log x = . . . = y + log x . 

o 01 1 n n 

If the a^ are chosen in such a way that 

lim X = 1, we see that lim y = y + log x. 
n-x» n n-x» n ^ 

Thus, as X is "normalized** to one, y is forced to the desired function 
n n 

value. 

The above is but one example of a function evaluated by a normalization 
method. Others amenable to the same approach are given in Table 5, along 
with appropriate transfonnations for the variables. 



Function 








y + log X 


(x,y) 




(l,y + log X) 


y/x 


(x,y) 




(i,y/x) 


w/x'^ 


(x,y) 




(i,y/x 


X 

ye 


(x,y) 


(x^ - log 


(0,ye^) 



Table 5: Normalization Methods 
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A number of authors have addressed themselves to the implementation 
of the normalization methods, including (in historical order), Speaker [61], 
Perle [46], DeLugish [16], and Chen [5]. The principal matter to be decided 
is how the a, should be chosen. Clearly a . is to be of the form 1 + s^2 i, 
where p^ is an integer, and s^ = ± 1 or 0. Ideally, the a^ should be chosen 
so that is forced to within a specified tolerance of its limit value for 
as small an n as possible. We shall discuss the various strategies for 
choosing the a ^ as we consider the various functions. 

Chen suggests the use of a termination algorithm which decreases the 
number of iterations required. Essentially the idea is to use a one term 
Taylor series expansion when the most significant half of the number has 
been computed. This is generally done at the expense of a half precision 
multiply (i.e., the most significant half of the multiplier is represented 
by all zero bits). It does have an additional advantage in that it halves 
the round-off error accumulation since only about one-half as many iterations 
are required. We will discuss the individual termination algorithms in 
the next section. 

The advantage of the normalization methods is their simplicity (depending 
on how one decides on > ^^d their potential speed. Of course they are 
not as versatile as the CORDIC algorithm and no normalization methods have 
been devised for trigonometric functions. 

4.0 Algorithms for Specific Functions 

4.1 Quotient 

There are several kinds of division algorithms. We have previously 
mentioned the CORDIC, successive subtraction, and normalization algorithms. 
Another normalization method using multiplication and based on Newton’s 
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method has been proposed. This procedure converges quadratically , and 
thus is best suited when high precision is required- Flynn [20] discusses 
a number of iterative techniques for division. Another method is based on 
a reciprocal generator. Huber [24] did a thesis on binary division algorithms 
which discusses most of the above methods. 

Suppose that we desire the quotient , where Y = 2^«y, X = 2^*x, 

with a, 6 integers and h, < x,y < 1, that is X and Y are expressed in 
normalized form. In some algorithms, we will assume y < x, relaxing the 
requirement that both x and y be in [^,1). This leads to the quotient being 
in [hyl) y hence no normalizing shift would be required for the quotient. 

In any case, we concern ourselves with the generation of the quotient y/x. 



4.1.1 CORDIC Algorithm 

Inspection of Tables 1 and 2 reveal that N + 1 iterations are required. 
Because of the sequence of a’s for this case we must have Y < x; this can 
be accomplished by right shifting Y one place, if necessary, or always. To 
avoid a possible left shift to normalize the quotient, the right shift of 
y should be done only if necessary. 

After N = N + 1 iterations we obtain 
o 

N 

o 



+1 ~ 



^ L s.a., 
° i=0 ^ ^ 



N 



then 



+ 1 ' 

o 



|y - X y s.a. I 

' O O 1 l' 



< X a = X a 
N N o N 



Since 

o 



N 

o 

^ ^i^i^ above yields, with “ 0 > 



|Y /X 
o/ o 



- z 



N +1' 
o 



< a. 



N 



= 2 



-N-1 



Thus the truncation error is bounded by 2 



-N-1 
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If a left shift were necessary to normalize the quotient, the error would 

-N+1 

become as large as 2 , thus we see why it is important to right shift 

Y only if necessary, 

4,1.2 Restoring and Non-restoring Division 

It would again be advantageous to have y < x and y/x in [^,1) to avoid 
a normalization shift at the end of the operation. Also, one would then 
avoid generating N+1 quotient bits when only N are necessary, since the 
first bit would always be 1. 

If N quotient bits are generated, there will be as many as N 
subtractions of the divisor from the dividend. Assuming left shifts of the 
dividend rather than right shifts of the divisor, no roundoff error will 
occur during these operations. The truncation error will be determined by 
the size of the remainder divided by the divisor, and will be bounded by 
K*2 ^ where 

I I if a final round is performed 

2 if chopping is used 

Thus the error is at most one bit. 

If N + 1 quotient bits are generated with a possible right shift 
required to normalize the quotient, the error bound is the same. 

Nandi and Krishnamurthy [44] proposed a procedure for a non-restoring 
type of division for radix 8. It obtained the quotient in redundant form, 
and was especially designed for divisors with leading coefficient 1 or 8 ~ 1* 
T^ile the procedure could be adapted radix 2, there is a conflict in 

the decision process, which differs depending on whether the leading coefficient 
of the divisor is 1 or 8 “ 1* These digits coincide in radix 2, of course. 
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The decision process for the quotient bits is somewhat complicated, as 
well. 



The advantage of redundant representation (using ± 1 and 0 as coefficients, 

instead of just 1 and 0) of function values is that more zero bits can be 
*’built in**. Every number has a minimal representation, i.e., a representation 
with a minimal number of non-zero bits. Metze has investigated this idea 
for several functions, one of which is division [41]. Basically the idea 
is an extension of non-restoring division, and contains S-R-T division [50] 
as a special case. The decision as to the value of the quotient bit is 
based on the value of the previous remainder. The decision is somewhat 
complicated, with the comparison constant a function of the divisor. 

The following procedure is that given by Metze for minimally represented 
quotients. The comparison constant, K, is obtained from Table 6. We first 

assume that < x < 1 and 0 < y < x. We just as well assume that < y/x < 1. 

n _ . 

Let Q = y/x = q.2 Q_ = 0, 2R_ = x. For i = 0,1,..., N , 

1=0 ^ 



determine by 



q. = 



1 


if 


1— 1 

1 

•H 

CM 


> K 


1 


if 




< K 


0 


Otherwise 





Then 2R 

Q 



i 

i 



= 2(2R^_^ - 
= Qi-1 + 



X) 



The comparison constants given in Table 6 are not particularly con- 
venient, but it is the smallest number of different ones possible. Another, 
more convenient set is given in Table 7. 

For S-R-T division the comparison constant is K = ^ for all divisorsf 
and a minimally represented quotient is obtained for divisors in the range 
[3/5, 3/4]. 
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Range of divisor K 

[1/2, 39/64) 13/32 

[39/64, 3/4) 1/2 

[3/4, 15/16) 5/8 

[15/16, 1) 3/4 

Table 6 

Range of divisor K 

[1/2, 9/16) 3/8 

[9/16, 5/8) 7/16 

[5/8, 3/4) 1/2 

[3/4, 15/16) 5/8 

[15/16, 1) 3/4 

Table 7 
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A minimal recoding can be obtained by requiring that non-zero bits be 

N+1 

separated by at least one zero, thus at most ["’^~] bits can be non-zero 
(see [41], first paragraph). Because of the left shift of the remainder, 

P^, no roundoff error is introduced. It is easy to prove that ^ 

— k ^ 

y = x-Q, + 2 R- thus for k = N, we have truncation error y/x - Q = — , 

K K. N X 

We show by induction that |R^| < x. = x/2, and assume that 

Then if = 0, ^ ” ^^^k-1^ selection rule 

for q^, and the recursion formula. Tables 6 and 7 both show that K < x, 

thus IR^I < X# = ± 1, we have R^ = 2R^_^^ - sign (R^ or 

IRj^I = - xI < X, since by induction I I < Thus IR^^I < x 

for all kj and in particular, |R^| <x, thus the truncation error 

2"^R 

^ N -N 

— - — <2 , or less than one in the least significant bit. 

Note that no shift for normalization is required. In order to be 

assured that y < x, however, it may be necessary to right shift y initially, 

-N-1 

If no guard bit is provided, this could cause an error of 2 in y , and 
since 2 ^ ^/x < 2 we see that the error could be as large as 2 ^ or 

the last two bits could be in error. The error in y can be eliminated with 
one gaurd bit, 

Metze makes no mention of the average number of non-zero quotient bits. 
The probability of a zero occur ing in S-R-T division is about .63, and Metze *s 
algorithm must do better than that. We noted previously we must have about 
one-half of the digits equal to zero, although it is unlikely that in the 
average case, half of the remaining digits will also be zero. The probable 
number of non-zero bits is likely between N/4 and N/3. 
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4, 1*3 Normalization Methods 



As we noted earlier, the principal matter to be decided here is a 

strategy for the choice of a, (see Table 5). We assume that a, = 1 + s.2 ^i , 

1 i X 

where s^ = ± 1 or 0, and p^ is an integer. The iteration procedure is 

X = X, y = y, and for i = 0,1,2,..., n-1 
o o 

^i+1 " ^i^i 
^i+i = • 

Further we assume that ^ < x < 1. We discuss several strategies for 



choice of a.. 

1 



The first is very simple, and reminiscent of the CORDIC algorithm. 
/ 



1 + 2 



-1 



if X. < 1 
1 



a. = 

1 



.“1 



1-2 if x.>l 

1 









It is easily seen that 1-2 < <1 + 2 , thus after n iterations 

'Vl ■ ■ ' Vl ■ Vl/*n+l' ■ 



y U 2'" ^ - 2-"y/x . 

■‘n+l ’‘n+l 



Thus, the relative error is 2 and if x and y are normalized as usual, 

s t 

h, < y/x < 2 , hence the truncation error is less than one in the (n- 1 ) 
significant bit. Roundoff error would accumulate up to 1082 ^^ ^ ^ bits, 
thus if n = N + 1 iterations are used, log 2 (N + 1) + 1 guard bits would 
give an error no greater than one in the last bit. 

Another similar strategy is 

1 + 2 "^ if x^(l + 2 "^) < 1 



a. = 
1 



otherwise 
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This division procedure is the analogue of the Speaker [61] and Perle [46] 

algorithms. It has the seeming advantage of skipping some iterations 

(if a^ = 1), and the real advantage of keeping < 1. However, note that 

the decision requires calculation of a tentative value of ^ , hence no 

i+1 

saving is obtained there. The roundoff error would be decreased in the 
average case, but not the bound. Truncation error is the same, although 
one knows its sign and could compensate after the last iteration. 

DeLugish [16] gives a more sophisticated strategy for choice of a^. 
DeLugish normally requires an initialization procedure. In this case it is 



2 


if 


1/2 < X 

o 


< 3/4 


1 


if 


3/4 ^ X 

o 


< 1 



and X- = a X , y_ = a y 
1 o o 1 o o 

Now, as one would likely do in practice in the previous examples, 

DeLugish does not actually compute the sequence of x^*s. Rather, for ease 

in testing, we compute the sequence = 2^(x^ - 1), and then ~ 2u^ + 

s. + s.u,2 ^ , Here p. = i+1, and s. is determined by 
111 1 1 

1 if u, < - 3/8 

1 



s. = < -1 if u, > 3/8 

1 1 



0 otherwise 

Delugish shows that I x^ - l| ^ 3/8*2 for i ^ 2, thus I “ 3/8*2 

so truncation error is no more than 3/4 that derived for the earlier examples. 

No study was made of roundoff error, but the bound depends on the number of 

iterations with s^ 4 0. DeLugish shows the probable number of non-zero s^*s 

is about N/3. The maximum number is not discussed. However, it is easy to 
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show that not more than two successive s. can be non-zero. For i > 2 . 

1 * 

I - 1| < 3/8 • 2 ^ thus lu^l < 2^*3/8*2 ^ ^ = 3/4. Now, suppose 

u. > 3/8 and thus s. = - 1. Then u.,- = 2u. +s. +s.u.2 ^ < 2*3/4 - 1 = 1/2. 

Also we have ^ (2 - 1/4)* 3/8 - 1 = - 11/32 > - 3/8. Then, suppose 

3/8 < < 1/2, hence ~ since otherwise = 0. Then 

u^_j _2 = ^i+l^i+1^ ^ ^ ^ (2 - l/8)*l/2 - 1 = - 1/16, and as before 

u^_l _2 > 3/8. Thus = 0, as was to be proved. If u^ is negative, the 

2N 

argument is symmetric and the result follows. Thus we have that log 2 ~ + 1 

guard bits are sufficient, with chopping. 

The method suggested by Chen [5] used a different approach. Here we 

must use some sort of procedure for counting left zeros in 1 - x^. We write 

1 - X. = 2 ^i + V., where 0 < v. < 2 Then p. is one plus the number of 

1 1 1 1 

left zeros in 1 - x. . With a. = 1 + 2 we have 

1 1 

x.^, = (1 - 2~Pi - v,)(l + 2"^i) = 1 - v.(l + 2"^^) - 2~^^i . 

1+1 1 1 

Thus we see that the transformation has eliminated the leading non-zero bit 

of 1 - x^, and leaves the succeeding bits largely unchanged. Thus, in no 

-N-1 

more than N + 1 iterations, 1 - x^ < 2 . On the average, one expects 

only about N/2 iterations would be necessary. The error bound here is the 

same as for the first two ideas presented. 

However, Chen suggests a termination algorithm, to be applied when 

p^ > N/2. It is simply a Taylor series expansion about the then current 

point. In this case, if we have computed > • • • > > with 

p > N/2, then y/x ~y +y(l-x). This requires a multiply, however note 
*^n •^n n n 

that since p > N/2, 1 - x is a half precision number (i.e., the most 
n n 

significant N/2 bits are zero), thus costing half a usual multiply time. 
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The truncation error bound is 2 . Chen notes that since the error is of 

a definite sign, we could halve the truncation error by computing 

y/x = y^ + (1 - . 

The roundoff error is halved, of course, since at most N/2 iterations 
need to be performed. Hence log^N guard bits are sufficient with chopping. 
The expected number of iterations is N/4. 



4.1.4 Quardratically Converging Normalization Methods 

The methods of Section 2.1.3 converge linearly, that is, the number 

of iterations required depends linearly on the number of bits of accuracy 

required. Quadrat ically convergent normalization methods are based on the 

Newton method for solving 1/z - 1 = 0 with initial guess z = x = x, 

o o 

where x is the divisor. The calculations can be arranged as y^ = y, 

X = X, then for i = 1,2,..., n, 
o 






X 



i-1 



y 



i-1 



(2 - X 



i-1 



(2 - 



) 

) . 



The error 1 - x^ = = (1 ** x^) . Thus convergence is critically 

tied to the accuracy of the "initial guess". We must have | 1 - x^ | <1 for 
convergence. 

We see that if 0 < x <2, convergence is assured, hence x = x is 

o o 

adequate for convergence, with x in [1/2,1). 

We pause to consider the number of iterations required. If we want 

E < 2 then n must be taken so that (1 - x < 2 or 
n o 

.n N 

-log2|l-Xo| 
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For X in [1/2,1), (1 - x ) < 1/2, thus no more than log^N iterations would 
o o 2 

be required. 

We could improve this a little by some initial adjustment of x and y. 

Thus we might take x^ = ax, y^ = ay, where a may or may not be a function 

of X. To avoid an actual multiplication here, we could take 

/ 

2 if 1/2 < X < 2/3 



1 if 2/3 < X < 1 . 
N , N 



Then |l - x I ^1/3, and log^ ^ - log 

o 2 log 3 



2 1.58 



iterations would then 



be required. For N = 20, five iterations would be required by the previous 
procedure, and four iterations plus the initialization for the latter pro- . 
cedure. This example is biased somewhat, however. Note that if N is a 
power of two, no advantage is had by first initializing as above. For N 



a power of 2, one would need to initialize so that |l - ax| ^ 2 to 

save k iterations. 

In general the roundoff error doubles at each iteration. Thus for 
' -N-1 

roundoff error to be less than 2 requires that J guard bits be used, 
where 2^(2 ^ < 2 ^ or J = n + 1 with chopping. 

Krishnamurthy [29] has considered solving 1/z - 1/K - 0 instead of 
1/z - 1 = 0. For a simple recursion for the x^’s, this necessitates K = 2 
for some integer r, although K = 2/3 is possible. He also considers using 
lower precision multiplies for early Iterations. A table of initial trans- 
formations is given for computing a 48 bit quotient in 4 iterations (plus 
the initial transformation) . 

The idea of a table for the initial transformation is used in the IBM 
System 360 Model 91 [1]. With that machine 56 bit accuracy is desired, and 
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a table of 2^ = 128 entries is used and gives (1 - x^) < 2 Thus, after 



the initial transformation, only three iterations would be required. However, 
in order to speed up the early iterates, low precision multiplications are 
used with truncated intermediate results, and four iterations lead to 64 bit 
accuracy. 

Ferrari [19] proposed a method he calls the optimized geometric series 

(OGS) method. If the two multiplications are to be done serially, his method 

is faster, since convergence is superquadratic. If the multiplications can 

be done simultaneously, his method has no advantage. 

Shaham and Riesel [58] suggest a modification of the Newton iteration. 

”lc "Ic 

Suppose that 1 - = 2 ^ 0 < v^ < 2 . Then normally one would 

-2k . -2k 

have 1 - = 2 + with 0 ^ ^ • They suggest using as a 

-2k 

multiplier, 2 - x^ + p2 , instead of 2 - x^, where p is a table-look-up 
value depending on the first few bits of v^. They show, for example, that 
a 56 bit result can be obtained in one less iteration than with the classical 
method, starting with an initial result accurate to six bits. 

Ling [33] has given a version of this algorithm which was purportedly 
designed especially for 32 bit accuracy, and yields the quotient in 3 
multiplication times. Ling's method is to first transform the divisor. 



with the eighth bit equal to zero. We see that IT 6 g = 0, the plus sign is 
taken, and d, =6.. If 6 o “ ^he minus sign is taken, and .d-..d = 

1 1 o 1 / 



there is an apparent difficulty here, which washes out later on. Ling 
implements the above via logical circuits rather than by actual addition 



—8 

X = + 2 (. 6 q ...6 ) into a form x = .d, 

Z i o y n X 



... 



—8 

d_ ±2 (. do . .d ) , i. e. , 

/ y n 





and complementation. The initial transformation is then y/x = ^ 



where 
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k = 1/2 ' (,d^. .d^) ^ is obtained by table-look-up. We then have x(2k) = 

1 ± 2 ^k( . dg. . . d^) , the ± having been determined previously. Ling indicates 

”8 ”32 

that the denominator is within 2 of one, thus within 2 of one after 
two Newton iterations. This is not the case however, as can be seen by 
considering x = . 1000000011. •• 1, whence k = 1, and 2xk =1+2 ^(.111...!). 
Thus Ling’s algorithm is good for at least 28 bits, but one cannot be sure 
of more than 28 bits. 

4.1.5 Other Methods 

Dean [10] proposes a reciprocal generator which forms the reciprocal 
of a number in a controlled register as the number is entered serially into 
a shift register. Formation of the quotient bits is done by means of logical 
circuits. He gives complete information for four, five, and six bit numbers. 
The logic would seem to become quite complicated for high precision numbers. 
However, if, as Dean says, the size of connectors negates the possiblity 
of components becoming much smaller, but rather their complexity increasing 
to take up (some of) the available space, such a device might be feasible 
where speed is the main consideration. However, I am assuming the logical 
circuits could be implemented so as to be very fast. 

4.2 Arctangent 

Several authors have directed their attention to calculation of the 
arctangent. While it is not readily apparent when glancing through the 
literature, it is in fact true that only one algorithm has been developed, 
so far as this author has been able to determine. The CORDIC method was 
the first to appear, followed by Meggitt [40] and Sarkar and Krishnamurthy 
[55], Specker [61], and DeLugish [16]. The algorithms differ mainly from 
the CORDIC algorithm in that not all of the rotations are performed. 
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Meggitt keeps > 0, Sarkar and Krishnaraurthy minimize jz^l, Speaker’s 
method coincides with the CORDIC algorithm, and DeLugish uses a decision 
process similar to that for division. 

We assume that tan ^y/x is to be computed, where x + 0. We will generally 
assume that a first quadrant angle is to be computed, hence an initial trans- 
formation using the identity tan ^(-y/x) = - tan ^y/x may be necessary. For 
the CORDIC algorithm, we must have x > 0, so the sign is attached to whichever 
of X or y is appropriate. If the result of the calculation must be in the correct 
quadrant, additional care must be taken with that step, and furthermore, an 
additional rotation through tt radians may be necessary. In the CORDIC 
algorithm this can be accomplished with no post-manipulation of the data. 

4.2.1 CORDIC Algorithm 

As Table 1 indicates, tan ^y/x is obtained by the algorithm with m = 1 

in the vectoring mode. Since we have discussed the general procedure in 

some detail in Section 3.1, we need only to discuss the truncation error 

bound for this particular calculation. As we indicated, the magnitude of 

2 2 1/2 

y„ can be as large as x^^ a in vectoring mode. Now x__ - K_ (x + y ) * 

Nj^+1 1 o o 

2 21/2 — N“ 1 

so |y, , ^ 1 < K- (x + y ) 2 . Consideration of equation (6) and some 

‘ N- +1 ' 1 o o 

algebraic manipulation yields tan a = y /x - — k cosa* have 

^O 1 

a = tan"hy^/x^ - y] = , 

1 + r 

where r is between y /x and y /x - y . Thus the truncation error is 

o o o o 

_ Y I 

2 ~ 2 , 2 * 

1 + r 1 + y /x 

o o 
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Then 



JL 



2 2 ' 
1+y /x 
•^o o 



2 ^ , 2_^ 2.1/2^-N-l 

X . K- (x +y ) 2 

o 1 Q 

2 2 

X K- cos a(x +y ) 
o 1 o ' 



X 2 
o 



-N-1 



/ 2 2 . 1/2 
vx +y ) cos a 
o '^o ' 



But — 2 2 ~/ Y " hence the truncation error is no more than 

-^o > 

-N-1 

(approximately) 2 . Roundoff error is likewise bounded, hence total 

error is no more than 2 

We will consider the problem of initializing for both principal values 

of the arctangent and for values in the correct quadrant, the quadrant 

determined by the signs of y and x. 

For principal value we let 

X = X sign X 
o 

y^ = y sign x 



z = 0 



o 

For the correct quadrant to be obtained, we take 



X = 

o 


X sign X 






y sign X 






1 TT if 


sign X < 0 


z = 


< 




o 


0 if 


sign X > 0 . 



One can also accomplish the initialization by always doing an initial 

rotation of tt/2 radians, clockwise if y > 0, and counterclockwise if y < 0. 

This will also force x > 0. Then 

o 

X = y sign y 
o 

y = -* X sign y 
z = tt/2 sign y 
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4.2.2 Modified CORDIC Algorithm. 

We will discuss the various modifications of the CORDIC algorithm in 
a unified manner. Basically, all of the modifications can be put in the 
same form as equations (3) - (5), except that the decision for s^ is different. 
This will change the radial distortion constant, but it does not enter sig- 
nificantly in this case, anyway. 

Meggitt takes 

sign if |y^| ^ 

"i = S 

0 otherwise 



Sarkar and Krishnamurthy take 

sign y. if |y. - sign y.*6.x.| 
1 111 ' 

"i = ^ 

0 otherwise 

\ 

DeLugish considers tan x, 0 ^ x < 1, and takes 




s . = 
1 



+1 if y. < - 3/8 
1 

■j -1 if y^ > 3/8 

, 0 otherwise. 



with the initialization procedure 

= 2^(x - 1) , = 2^(x + 1) , 

with 8 chosen so that x ^ is in [-1,0). 

Then y , is in [1,2). We set z = tt/4, 

—1 o 

X = x,t,y = y,t , where 
o -1 o o -1 o 
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3/4 if 



0 < X < 1/4 



t = <5/8 if 1/4 < X < 1/2 

o 

^ 1/2 if 1/2 < X < 1 

Note that the sign of the s. is different because x. < 0 for his initialization. 

1 1 

The error bounds for all of the above are the same as for the CORDIC 
algorithm, and the roundoff error will probably be smaller in practice because 
the average number rotations is less than N+2. For the DeLugish method the 
average number of rotations (number of non zero s^) is about N/3. 

4.3 Cosine/Sine 

As with the arctangent algorithms, all of the algorithms for cosine/sine 
reduce to the CORDIC algorithm, or a variation of it. We consider that the 
angle, 0, could be any angle. This is then reduced to the range [0,27 t). 



and then 



sinZ=sin0=^ 



and 



cos Z = cos 0 = 



loted by 


Z. 


Then we can find 


z in [0,7 t/2) so 


Q = 0,1, 


2, or 3. 


We compute 


sin z and cos z. 


sin z 


if 


Q = 


0 




cos z 


if 


Q = 


1 




-sin z 


if 


Q = 


2 




-cos z 


if 


Q = 


3 




cos z 


if 


Q = 


0 




-sin z 


if 


Q = 


1 




-cos z 


if 


Q = 


2 




sin z 


if 


Q = 


3 
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The proper values can be obtained either by pre - or post - manipulation 



of the data. 



-1 



4.3.1 CORDIC Algorithm 

By Table 1 we see that if one enters the algorithm with m = 1, x = 
y = 0, and z in the rotation mode, one obtains cos z and sin z for |zl < tt/2. 
From equation (6) 

y^ -fl ” ^1^*^1^ a) = - sin a . 



Now |z + a| = ^ thus 



= cos (z - y) = cos z 4* sin z* • Yj 



+1 ” (z “ y) = sin z - cos z • y> 

where z* and z are each within IyI of z, and IyI - = tan ^2 ^ ^ - 2 ^ 

Then, since sine and cosine are bounded by one, the truncation error is 

-N-1 -N-1 

bounded by 2 .If roundoff error is also bounded by 2 , total error 



is bounded by 2 



-N 



4.3.2 DeLugish Modification 

Because the distortion constant depends on the rotations performed, 
the usual DeLugish [16] modification will not work. Recall that the distortion 
in the step 

X. , - = X. + S.6 .y . 

1+1 1 1 i"^i 



y..i = y. ” S.6.X. 

-^1+1 1 ill 

2 1/2 

is (1 + 6^ ) . A correction to remove the distortion would require 
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2 1/2 

division of each variable by (1 + 6^ ) . This is not feasible, of course. 

2 -1/2 2 4 

But (1 - 6^ ) = 1 - 1/2 6^ + 3/8 6^ and if 6^ is small enough for 

^N-1 2 -1/2 2 

the third term to be less than 2 , we have (1 + 6^ ) = 1 - 1/2 6^ , 

rounded to N digit accuracy, and such a correction could be made with a 

2 - 1/2 

shift and subtract, since 6^ is a power of two. Eventually, (1 + 6^ ) = 1, 

rounded to N bit accuracy, and then no correction is required. This idea 
is used by DeLugish to introduce redundancy into the cosine/sine calculation. 

We consider 0 ^ z < 7 t/ 2. The initialization is 

z if z < 7 t/4 

7t/2 - z if z > 7t/4 




and 






(l/K* , (1/K*) tan ir/8) if z ^ ir/4 

A 

((1/K*)tan tt/8, 1/K*) if z > i:/4 

V 



z 

o 



tt/4, 



where K* = (1 + tan ir/8)^^^ n (1 + . 

j=0 

Here N = the number of iterations to be performed before shift 

and subtract corrections can be made for the radial distortion. The iteration 



equations are 


^i+i 


<■‘1 - 




^i+i = 


(y. + s. x.2~^"^)T 
1 11 




^i+l " 


-1^-i-l 
z . - s . tan 2 
1 1 



where s . 

1 



is determined by 
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s . = 
1 



-1 if z. < 0 
1 



1 if > 0 



for i = 1,2, . . , N , 



s . = 
1 



■1 

1 

0 



if < - 3/8 2 



-1 



if z^ > 3/8 2 



otherwise 



-1 



for i = N + 1,..., N 
and 



T. = 
1 



1 for i = 1,2, ... ,N 



, 2 ^-(2i+3) * ** 

1 “ 2 ^ ^ for i = N +1, . . . , N 






for i = N +1,..., N . 



** N-3 2-1/2 

Here N = ["“^] + the point at which (1 + 6^) =1, toN bit accuracy. 

The above was as given by DeLugish. DeLugish, however, did not consider 

the effects of roundoff error, and his examples were done with N = 40, and 

at least 13 guard bits. For purposes of roundoff error control, the values 

-k kk 

of N and N should be determined on the basis of N + J bit accuracy, rather 

* ** N+J-6-1 

than N bit accuracy. Thus, one should take N and N to be [ — — ] + 1 
and + 1, respectively. 



The truncation error is similar to the CORDIC algorithm. The roundoff 

error bound is complicated by the fact that about iterations are performed 

which may require corrections T^, with T^ + 1. It is conceivable that nearly 

all rotations (all s^) will be non-zero, thus the roundoff error bound will 

be about 25% larger. However, the average number of s^ equal to zero should 
3N 

be about — r , so on the average, roundoff error will be smaller, 
o 
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4* 3, 3 An Alternate Procedure 



We have seen in the previous section that the radial distortion 

constait complicates the usage of no rotation at any step. If one were to 

use a scheme such as the DeLugish scheme for computing the tangent (See 

t an z 

Section 4.7.2), the sine could then be computed from sin z = ^ • 

(1 + tan z) 

This type of algorithm was implemented in the Hewlett-Packard HP-35 calculator 
[6]. In addition to the fact that the calculations are radix 10, the goal 
there was compactness of the program, rather than speed. The above identity 
is certainly not likely to achieve high speed. 

4.4 Exponential 

The calculation of the exponential function has been treated by several 
authors. Normalization is the principal technique, although the other unified 
algorithms also can be used. 

We will generally allow the argument to be any number, X. The range 
will be reduced by the transformation X = Q log2 + x, where Q is an integer, 
and 0 < X < log 2. Then e^ = e^ 2 + x _ 2^e^. Note that 1 < e^ < 2. 



4.4.1 CORDIC Algorithm 

Table 1 yields the information that entering the algorithm with m = - 1 

in the rotation mode, taking x = 1/K^^y y = 0, and z = argument, the result 

2 

of the calculation will be x^ +1 " cosh z , y^ +1 " sinh z. Then e = 



cosh z + sinh z 



Xi+i Xi+1 ■ 



We will use log u to denote log u = In u, while any other base will be 
specified by subscript. 



- 37 - 



The truncation errors are given by 



- cosh z = cosh a - cosh 



= cosh (z “ z^^ n ) “ cosh z 
o 



= “ (sinh z ) z 



N ^^+1 ^ 



and similarly, 



y ✓ “ V ^ 

N_^+l - sinh z = - (cosh z) z^^ , where z 

2 

z and z - z.. . Then, the truncation error for e 

N ^+1 



and z are between 
is the sum. 



and 0 ^ z < log 2, the bound is 2*2 ^ ^ = 2 Roundoff error in each is 

-N-1 -N 

2 , hence the total roundoff error could be as large as 2 , giving 

-N+1 

a total error of no more than 2 . However recall that a right shift is 

-N 

necessary to normalize the result, thus the error is then bounded by 2 



4.4.2 Pseudo-Division/Multiplication Methods 

From Table 4 we see it is first necessary to recode the exponent in 

in the variable radix log (1 + 2 This is accomplished by the modified 

divider discussed in Section 3.2. Let us first investigate the error in 

the process of recoding the exponent. Without rounding error, and chopping 

after N bits it is easily seen that the truncation error in the process is 

-N -N 

bounded by log (1+2 ) ^ 2 . The roundoff error is introduced solely 

through the divisors, and is left shifted in the new dividend at each 

iteration. If the error in divisor d. is € . , we have error 

1 1 

- 2 ^ - 2 ^“^ - - 2 

^ 0^0 ^1^1 *’* in Thus the error accumulation in 

N+1 

N+1 

remainder is the above, divided by 2 . Assuming correct rounding of the 
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divisors the roundoff error accumulation is bounded by 



2-“-^2"+ ... + 2)2-”-l i - 2-"-l. 

-N“l 

Thus if no guard bits are used, roundoff error is less than 2 . Thus 

-N -N-1 

the total error in this procedure could be as large as 2 +2 . If we 

P”Y I I “N 

take y = X = 1 in the procedure, we obtain e , where |y| < 3/2 • 2 

Thus the error here is e^ - e^ ^ = e^ y , where p* is near p. Thus the 

error is bounded by e^°^ ^|y| -2 • 3/22^=3 • 2^. 

Now in the pseudo-multiplication, roundoff can occur only in the 

calculation of the pseudo-multiplicand, x^. The error propogated to the 

N-1 N 

pseudo-product then is 2 2 where is 

I 1 “N 

the error in x^^. = 0, and |£^| < 2 , so the total error is bounded by 

N N-1 -N 

(2 +2 + ..• + 2)2 < 2, Because the accumulation is in a double 

length register (the least significant part being N + 1 bits), we see that 

-N-1 -N 

the actual error is no more than 2*2 = 2 . 

-N -N 

Thus the total error in the procedure is bounded by 3 • 2 +2 = 

-N+2 

2 • Recall that a right shift is necessary for normalization which 

j i_ o“N+l 

reduces the error to 2 

We note that the above was accomplished without guard bits, and one 

double length register. The inclusion of one guard bit would reduce this 

-N 

error to 2 . However it would be necessary to include an additional 

quotient bit in the modified division to recode the exponent, since this 
was the largest source of error. 



4.4.3 Normalization Methods 

The basic algorithm has been described previously, the choice of a^ 
being the issue to be decided. Specker [61] and later Perle [46] use the 
criterion 
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a. = 
1 



1 + 2 ^ ^ if - logd + 2 ^ 0 

< 

1 otherwise. 



Then a 0 for all 1, and It is easy to show that x^ < 2 thus after N 

-N-1 ^+1 X 

iterations ^ 2 . Then we have e = y e , and taking 



%+l ~ gives an error of y^^^ 



'N+1 

, ^+1 ^ X ,-N-l 

= yu+l • *N+1 ^ 2 



-N 



Since 0 < x < log 2, the truncation error is seen to be bounded by 2 

-N-1 

Roundoff error is bounded by 2 , assuming a sufficient number of guard 

bits, and since the result must be right shifted to normalize, the total 
error is bounded by 2 +2 < 2 . 

DeLugish [16] initializes by taking 
= X - log e^ 



= e 



where 



Then a. 
1 



e = 
o 



1 + s . 2 ^ ^ is 
1 






1/2 

e 


if 


/ * 
X > 1/2 


1/4 

e 


if 


1/4 < X < 1/2 


1 


if 


X < 1/4 


chosen by 


-1 


if 


X. < - 3/8 • 


1 


if 


1 

x^ > 3/8- 2"^ 



0 otherwise 



It is shown that |x^| ^ 3/8*2 hence “ 3/8*2 so truncation 

error is slightly smaller than in the previous case. The number of non-zero 



DeLugish actually allows |x| < log 2, and thus two more possible initilization 
values are given by him. 
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is about N/3. One can again show that (for moderate i) that no more than 

two successive s. can be non-zero, thus the maximum number of non-zero s 
1 i 

is about 2N/3, decreasing the number of guard bits required. Total error 
IS bounded by 3/4*2 +2 =5/4*2 <2 

Chen [5] lets be one plus the number of leading zeros in x^, i.e., 

x^ = 2 ^i + v^, 0 < < 2 ^i. Then log(a^) = log(l + 2 ^i) = 2 ^i + 

0(2 ^^i) , and the leading one bit in x^ is eliminated. At most N + 1 

-N-1 

iterations will be required for x^ < 2 . The termination algorithm, to 

be applied when p >N/2ise ::y +xy. Again, as with division, 
n •'n n n 

truncation error can be halved by taking e^::y +y(x + 2^^). The 

n n n 

-N 

total error bound is again less than 2 



4.5 Power Function 



The power function, x^ , has not received much attention in the literature. 

y V log X 

This function is usually evaluated by the identity x = e , which 

requires evaluating both a logarithm and an exponential, with a multiplication 
necessary also. 

A direct method was proposed by Krishnamurthy [28]. He assunied the 
numbers were in radix r, but we specialize this to r = 2. We first write 
y = I + f, where I is an integer and 0 ^ f < 1. x^ can be evaluated by 



n 



r ~1 

multiplications and a division, possibly. Let us write f = /^ q.2 . Then 

i=l ^ 

’ n , 

f ^ q.2 ^ ^ 

X = X Y ^ 1 

1-1 



. n I’i 



1/2 1/2 

If we let Zj^ = X , = z^ , for i = 1,2,...,N - 1, we have 

f N f 

X = n z i . We can determine x with the following loop. 
i=l ^ 
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Let p 



o 



1, z 



o 



X, then 



for i = 1,2, . . . ,N 




f 




if 







Then x^ :: 



This method would seem to be a rather time consuming procedure, s'ince 
N square roots and up to N multiplications are necessary for only the 
fractional part of the exponent. It seems unlikely that it would compare 
favorably with the conventional method, unless a square root operation 
were available, and a logarithm/exponential routine was not. 

Roundoff error would be able to come in both during the square rooting 
operation and during formation of the product, and would thus require more 
guard bits than usual. 

4.6 Logarithm 

Since the logarithm is the inverse of the exponential, it seems reason- 
able that the inverse of methods used for the exponential could be used for 
the algorithm. This is essentially true. Most routines are for any base 
logarithm, depending on the stored constants, but one due to Philo [48] and 
Dean [15] is for base 2 logarithms. 

We will consider the logarithm of X, where X is any positive number. 

We then write X = 2°^ • x, where 1/2 < x < 1, and then log X = a log 2 + log x. 
It is then necessary to discuss the computation of log x. 
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4.6.1 CORDIC Algorithm 



—1 V “1 

Here we will make use of the identity log w = 2 tanh — rr • If 

w+1 

1/2 ^ w < 1, then -1/3 ^ 0> ^ region for which the CORDIC algorithm 

converges. We enter the algorithm with m = - 1 in the vectoring mode, with 



z=0, x=w+l and y = w - 1. The value of must then be left 



shifted to obtain the result. 

Truncation error is found by consideration of equation (6), by a 

procedure similar to that for the inverse tangent, and we obtain the bound 

6 . Roundoff error is of similar size, however, these bounds are for the 

N T 



hyperbolic tangent, which must be left shifted to obtain the logarithm 
thus the error bound is 2 



4.6.2 Pseudo-Division/Mulitplication Methods 

This is a pseudo-division process, and can be carried out in one 
pass, as is seen from Table 3. Normally, log z is required, and this can 

1 “Z 

be obtained by setting y = l- z, x=z. This gives log(l 4* y/x) = log(l + ~ 

- log z. This particular transformation keeps x and y small and non- 

negative, as is desirable and necessary. 

N+1 

Using no guard bits, we find that the error in is -2 " 

N I I -N 

2 q-£. - ... - 2q„£^, . Since £ =0, and £, < 2 , we have the error 

^11 N o ' 1 ' 

bounded by (2^ + 2^ ^ + . . + 2)2 ^ = 2. The pseudo-remainder is z^^^/2^"^^, 

-N 

hence the error is bounded by 2 . Now the accumulation of the psuedo- 

dividend requires the use of guard bits. If it is done in the double length 

register, as suggested by Meggitt, roundoff error will be insignificant, and 

-N 

the total error will be less than 2 
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4.6*3 Normalization Methods 



Speaker [61], Perle [46], DeLugish [16], and Chen [5] have suggested 



these algorithms, and the selection of a^ is the same as for division, since 
in each case it is desired to force to 1. The error bounds are about 



the same. If |l - x^| ^ 2 ^ then the error is log|l “ z 2 ^ 

-N-1 

before. Roundoff error again can accumulate up to 2 for a total error 



-N-1 



as 



of 2 



-N 



Chen's termination algorithm is y + log x :: y - (1 - x ) , or to 

n n 

-N-2 

reduce the truncation error, y + log x-y -(1-x +2 ). Again n 

n n 

is taken so that p^ > N/2. Comments pertaining to errors, made in Section 
4.1.3* apply here also. 



4.6.4 Other Methods 

Philo and Dean suggested the following method for computing log 2 X. 
It is the inverse of the procedure described in Section 4.5 for computing 
It was discussed by Dean in terms of implementation using cellular 
arrays. 



We want to compute digits such that x = 2 



N 

1 q^2 

1=1 ^ 



-i 



We assume 



that 1 ^ X < 2 so that 0 ^ log 2 X < 1. Then x = 2 ^^^ ^ , 

1 / 2 ^ 2 2 
• 2 . If X S 2 , then it is apparent that = 1. If x <2, then 

= 0. The following recursion can be seen to generate the q^. 

X = X, for i = 1,2,..., N, 
o 



^i = 



if 



if 



X >2 

i-1 ■ 



X, , <2 

x-1 
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and 



X, = 
1 






Vi 



if q. = 1 
1 



if q. = 0 



-N 



The truncation error of the above procedure is 2 if enough guard bits 
are carried in the calculation of the x^. The error will generally more 
than double at each iteration, although a possible right shift will tend 



to keep it about the same magnitude. For example, if the error at the 



th 

i-1 stage is then 



e . = 
1 



1/2.(2x^_^£..j + ) i£ q. = 1 



2x , - e . - + c . - 
i -1 1-1 1-1 



if q. = 0 
1 



1/2 1/2 

Considering that q^ = 1 or 0 as >2 or x^_^ <2 , we have 



c . < 

1 



< < 



1/2-(2*2-€. . + €. = 2e. T + l/2e. - 2e . 

1-1 1-1 1-1 1-1 1-1 



O ol/2 X 2 .3/2 . 2 .3/2^ 

2*2 =2 ^2 £. T 

1-1 1-1 1-1 1-1 - 1-1 



This does not include the error due to chopping after the squaring operation, 

-N-J 

which would be bounded by 2 , using J < N guard bits. e is zero, so we 

have |e I < + €p = 2"^”'^ • Then |e [ < ^ + 2 ^ = 

1 

3/2 

(2"^ +1)2 . Continuing in this fashion, or by induction, one obtains 



|e^l < [(2^/^)^ ^ + (2^/^)^ ^+...+ 1]2'^ \ 



or 



le^l < (2^^^)'^* 2 ^“‘^= 2 ^^^ . Thus the use of N/2 guard bits 



- 45 - 



-N 

will prevent buildup of error larger than 2 

Because repeated squaring is required, this method is probably not too 
attractive, even though it is a simple calculation. 

Nicoud and Dessoulavy [45] suggested a method which requires repeated 

multiplication. Suppose log x is desired, where 1 < x < 2. Now, for an 

”lc n 

appropriate number u = 1 + 2 , we consider the sequence l,u u , until 

u" S X. Then we have log x » log u^ = n log u. The truncation error is 

I log u’^ - log x| ^ log u’^ - log u'^ ^ 

= log u = 2 

hence we would probably take k = N + 1. The error introduced by forming 
the sequence l,u,...,u'^ is bounded by n 2 ^ Unfortunately, for x : 2, 
we would require 



^ log 2 
log(l+2"”"^) 



3 * 2 ^ ^ 



9 



thus the number of operations required for high precision is excessive. 
About N + 1 guard bits would be required. 



4.7 Tangent 

Several authors have discussed generation of the tangent function, 
including Meggitt [40] and DeLugish [16]. Of course, the tangent can 
always be generated as the quotient of the sine and cosine, say from the 
CORDIC algorithm. Both Meggitt and DeLugish use this idea, with modifications. 

We assume that the angle is reduced as in the sine/cosine routines. 

Thus we have a first quadrant angle. This is further reduced by the identity 
tan 2 = cot(7r/2 - z) if z > tt/4. 

Meggitt ’s initial modified division is equivalent to finding which 
rotations need to be done to drive the angle to zero, and doing only those 
which do not **over-shoot” zero. The arbitrary value, Q, should be approxi- 
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inately of full register length, near one. The accumulation in the double 



length register renders roundoff error to be insignificant. Error in 
recoding the angle is about the same as was discussed in the case of the 
exponential function, and one guard bit and N + 1 iterations (n = N + 2) 
are required to bring the truncation error bound down to the desired level. 

DeLugish use the same initilization as for sine/cosine, and the same 
decision process as is used in the latter stages of that procedure. Of 
course, the radial distortion correction is not necessary for any of the 
iterations . 

The errors in calculation of the sine and cosine were found to be 



,-N 

K sin z - € 



bounded by 2 , and these bounds hold for the modified procedures here. Then 



K cos Z “ € 



^ ^ (tan z “ e^/K)(l + 



K cos z 



e^tan z 

tan z - €^/K + — 



The approximate error is then bounded by le. I + ^ — < (1 + v^)2 

'1' cos z 

However, separate consideration of the truncation and roundoff errors in 



K sin z and K cos z will yield a better bound, about (1 + l//2)2 an error 
confined to the last two bits of the result. For angles greater than tt/ 4, 
the reciprocal is taken, and the error becomes arbitrarily large. This is 
natural, however, since the value of the tangent also becomes arbitrarily 
large. 



4.8 Square Root 

Computation of square roots has received a great deal of attention , 
Lenaerts [30] was one of the first to discuss the possibility of a build-in 
square root operation; his suggestion was for a particular type of existing 
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computer, and was an adaption of the usual hand method for extracting square 
roots. Cowgill [8] gave a non-restoring version of the usual hand method. 
Kostopoulos [26] and others have also suggested adaptions of the basic hand 
calculation. Metze [42] gave an algorithm which generates the square 
root in a minimally represented redundant form. The pseudo-multi- 
plication of Meggitt [40] and Sarkar and Kirshnamurthy [55] will calculate 
square roots, as will the CORDIC algorithm. Chen [5] and DeLugish [16] give 
normalization techniques for the calculation. DeLugish gives two algorithms 
for the square root. One is an additive normalization procedure which is 
actually a modified hand method. We will discuss iterative techniques briefly. 

We assume that X = 2^^. x, where 1/4 < x < 1, then = 2^. v^, and 

1/2 < < 1. We concern ourselves with the computation of . 

4.8.1 CORDIC Algorithm 

Suppose ✓w , 1/4 ^ w < 1 is to be computed. Then, setting x = w + 1/4, 
y = w - 1/4, and entering the algorithm with m = -1 in the vectoring mode 

V 2 2 

X -y = K . Unfortunately, a multiplication 

by K ^ ^ is then necessary to obtain i/w , taking an additional multiplication 

-2 

and increasing roundoff error. An initial formation of wK ^ could be done, 

-2 -1 
also, but would require that be stored, as well as 

The truncation error for the calculation of K ^ is a lengthy calcu- 



lation, and is approximately - ^-1 -^+3/16) ^ ^ ^ ^ tanh"^ ^ -_l/4 



(w+1/4) cosh ot 



-1 



w + 1/4 



-N-2 

Hence the truncation error is bounded by K_^* 2 . Roundoff errors are the 

same as for other applications of the CORDIC algorithm, so the error in 
K_^ }/w is bounded by 2 ^ ^ 4- K_j^2 ^ The division (or multiplication) 

would Introduce additional error. 
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4.8.2 "Hand” Methods 



The usual hand method for extracting square roots is a matter of 
trial and error. Let us outline and review the procedure for radix two. 



Suppose we have a number x, 1/4 < x < 1, and we have computed digits 

i-1 

is as large as possible, and still 



so that R 



= 5^ q.2 ^ 

j-i j 



satisfies Then we take q^ to be zero or one, depending 

on whether x - (R^ ^ + 2 is negative or non-negative. If we 

2 

let E. = X - R. , then 
1 1 

E. = X - (R. , + q.2^)^ = E. , - q.2”^‘''^(R. , + q.2"^"^). 

1 1-1 ^1 1-1 1 1-1 ^1 

Thus, we see that a trial value of can be computed by subtracting 

2 + 2 ^ from If the result is negative, q^ = 0 and 

E^ = E^ If the result is non-negative, q^ = 1, and E^ has been computed. 

For machine implementation it is more convenient to deal with 

M. = 2^E., rather than E.. The calculations can then be arranged as 
111 

follows. Since 1/4 < x < 1, q. = IJ thus we can initialize 



= 2(x - 1/4) 



R^ = 1/2 

For i ■= 2 , 3 , . . . ,N 



T. = 2M. , - (2R. , + 2 ^). 
1 1-1 1-1 



Then, if T. > 0 
1 

qi = 1 



M. = T. 

1 1 

R. = R. + 2 
1 1-1 



-1 



or 



T. < 0 
1 

qi = 0 

M. = 2M. , 
1 1-1 



R. = R. , 
1 1-1 
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This is seen to be a restoring type of process, since a trial value is 
computed, and then may be discarded. 

-N 

It is easily seen that the truncation error is 2 , since we have 

computed the first N bits of the square root, and the remainder, 



i=N+l 






-1 



< 2 



-N 



As we have outlined the procedure, no roundoff error 



occurs since no numbers are right shifted off the register, and all other 
numbers are represented exactly with N fraction bits. We should note that 
can be large enough to require some bits to the left of the binary point, 
however. This might be best handled by scaling and inclusion of an appro- 
priate number of guard bits. 

The above procedure can easily be converted to a non-restoring square 
root routine. The version we give is not generalized the same way as the 
non-restoring division routine, however. 

Let = 2(x-l/4) 

= 1/2 

For i = 2,3, ... ,N 



then 



R. 

1 

M. 

1 




if 


^i-1 


> 0 


if 


^i-1 


< 0 



R + q.2 
1-1 ^1 



-1 



2M. , - q. (2R. - + q.2~^) 
1-1 ^1 1-1 ^1 



i 



I 



The error bounds are the same as for the restoring square root procedure. 

Metze has given a non-restoring algorithm which yields the value in ^ 
a minimally represented redundant form. As with the Metze division algorithm [41] 
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there is overlap in the regions where a digit may be chosen as zero or 
non-zero. We outline the algorithm as given, and then discuss the overlap 
and a possible compromise. Let 




Here will correspond to the partial remainder x-S^ , where S. is the 
approximation to the square root after i steps. 

For i = 0,1, ... ,N 



° = 2 ^(±5/3-s^_^ + 25/36-2 



K 



^ ^ = 2 + 4/9-2 



Determine by 



'll 



0 if K."^ < R < K.'*’® 

1 1-1 1 

1 if < R. , 

1 1-1 

-1 if K."^ > R. , 

1 1-1 



Then 



R. = R. , - q.2 ^(2S. , + q.2 
1 1-1 ^1 1-1 ^1 



S. = S. , + q.2 
1 1-1 ^1 



-1 



Since K. ^ < K. ^ , it is not well defined in the above 

1 1 1 1 

how q^ is to be chosen if overlapping regions. It 

can be chosen either way one prefers; the representation will still be 

iO f 1 

minimal. Since the above values of and are not particularly easy 

to compute, a reasonable suggestion would seem to be to compromise at some 
point in the regions which is easier to compute. Such a value is 



=2 + 2 ^ . Actually, in the above there appears to be a 
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25/36, and 



iO 

slight difficulty with the iteration i = 0 since then “ 

i 1 

Kq = 4/9. However, since > 0, the negative values are not required. 

The same difficulty appears in our suggestion, and is resolved the same way. 

With our suggested criterion for determining the and the replacement 
of by 2^R^, an algorithm for computing minimally represented square roots 
is as follows. We remove the apparent difficulty on the first iteration by 
initializing differently. 

Let ^ 1 if X > 1/2 

^o ^ 

0 if X < 1/2 

and M =x-S^=x-S 

0 o o 

For i = 1,2, ... ,N 

K." = ± 3/2-S. , + 2"^“^, 

1 1-1 

then determine q^ by 







1 


if 


1 


< 2M. , 
1-1 






-1 


if 


K." 

1 





0 otherwise 



Then M. = 2M. , - q.(2S. , + q.2 
1 1-1 ^1 1-1 ^1 



S. = S. T + q.2 
1 1-1 1 



-1 



The same comments pertaining to roundoff error made for previous 

methods applies here. Metze shows that the remainder R^ is bounded by 
±0 

, thus we have 



2”^(-5/3-S^_^ + 25/36-2 < x - < 2 ^(+5/3-Sjj_^ + 25/36 -2’^). 
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-N 2 -N 

Now, S-. - - S,,, and 2 << S,, , so x ~ is bounded by about 2 *5/3-S^^ < 

N-1 N N N N 

-N+1 

2 S^. The latter can be shown to be a rigorous bound for moderate 

2 

values of N. We are interested in - S^, not x - , but recalling that 



we have ~ 

X - S. 






N 

»6c + S 



N 



X - S. 



N 



2“^^^S -N 

< N = 2 



2S 



N 



2S 



N 



Thus the error is no greater than one in the last bit. 

DeLugish gives a similar procedure where the comparison constant is 

not a function of the partial result. Let U = x /4, then the initialization 

^ o o 



step is 

= 1/2 (x - r^^) 
1 o o 




where r^ is given in Table 8. The comparison constant c is also given in 
the table. 



interval for x 


r 

o 


J. 

c 


[1/2, 5/16) 


1/2 


3/8 


[5/16, 3/8) 


9/16 


7/16 


[3/8, 9/16) 


5/8 


1/2 


[9/16, 7/8) 


3/4 


5/8 


[7/8, 1) 


1 


3/4 



Table 8 
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For i = 1,2, .. . ,N-1 






{ -1 if U. < - 

1 4 

1 if U. > 7 - 

1 4 



0 otherwise 

and = 2 U^ - l/ 2 *s^(R^ + 2 "^"^) 



R.,, = R. + S.2 
1+1 1 1 



-i -1 



DeLugish shows that |v^ - R^| < 2 hence |v^ ~ R^| < 2 so 

-N 

truncation error is bounded by 2 . Because of the way DeLugish initializes, 

2 guard bits are required to avoid error in the initial calculation of U 

o 

and in the latter stages of the sequence. This is the same problem 
referenced earlier when discussing the magnitude of the partial remainders 
and a possible scaling with guard bits to contain the right shift. 

4.8.3 Pseudo-Division/Multiplication Methods 

Meggitt^s method can be used to compute square roots, as Table 3 shows. 

The only complication is that the recursion for has three terms, but 

this is common to most methods. Roundoff error will occur in the calculation 

of but as before, the effect is less than 1/2 in the Nth digit, or less 

-N~l 

than 2 , with no guard bits required. 

4.8.4 Normalization Methods 

Normalization methods are for the calculation of y/>^ , and is obtained 

by setting y = x. The procedures are similar to those for division, except 

2 

that x^ is multiplied by a^ , requiring a three term sum, as in the hand 
methods. We assume 1/4 < x < 1. 



DeLugish initializes by x^ = x^^ , = y, then 
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U- = r X - 1 
1 o o 



R 



= R /r 
1 o o 



where 



r = 
o 



4 if 1/4 < X < 1/2 
o 



1 if 1/2 < X <1 
o 



For i = 1 ,2, . . . ,N 



■ 



-1 if > 3/8 

< 1 if < 3/8 

0 otherwise . 



and then 



= 2U. + s. + 2 ^(2U.s.+l/4-s.^) + U,s.^-2 ^ 

1+1 1 1 'll 1 ii 

= R. (1 + s.2"^"^) 

1+1 1 1 



In the above, we note that a. = 1 + s.2 

1 1 

where x.,- = x.(l + s.2 ^ 

1+1 1 1 



-i-1 



and = 2 ^(x,^t-1), 



i+1 



I I “N 

DeLugish gives the error bound - 3/4*2 , and this gives a 

-N-1 

truncation error bound of about 3/8.2 <2 . Roundoff error bounds 

are the same as most other DeLugish algorithms. The average number of non-zero 



9 ^ is about N/3. It is not known what the maximum number of non-zero is, 
but in any case log 2 N + 1 guard bits are sufficient. 

The Chen algorithm again counts left zeros in 1 - x^ = 2 ^i ^ + , 

0 ^ V. <2 ^i"^^ ^ and p. is two plus the number of left zeros in 1 - x.. 









Then a. = 1 + 2 ^i , and l-x..^ = l- a.x. =1- (1+2 ^i) (1-2 i + V.) 
1 1+1 11 1 

= + 0(2 ^^i ^) , thus, again the leading one-bit is eliminated. 

The termination algorithm, again applied when p^ ^ is y/ + 

1/2 *y (1 - X ). For purposes of reducing truncation error, the termination 
n n 
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algorithm z + 1/2 ^ ^ ^ used. Truncation error 

-N-2 

is then bounded by 2 . The same comments about roundoff error made in 

other discussions of Chen's normalization algorithms apply here. 

4.8.5 Iterative Procedures 

2 

The classical iteration for the solution of Z - x = 0 is the Newton 
iteration, “ 1/2 - x/Z^) , with Z^ given as an initial guess. 

Convergence is quadratic. Many papers have been written concerning optimal 
starting values and other aspects of the iteration. As written above, a 
division is required. 

Ramamoorthy, Goodman, and Kim [49] have made a study of iterative 
methods which use only multiplication, except for a possible final division. 
The iteration 



R 



i+1 



R. . 

= ^ (3 - xR.^) 



- 1/2 ^ 

converges to x for appropriate R^. 

Another iteration, involving two variables 



'i+l ■ 

• 

- 1/2 1/2 

Here {R^} converges to x and to x 

It is interesting to note that if = N for all i in the first equation, 
that this is the iteration for the reciprocal of N, thus if this type of 
division were employed, the square root could be implemented easily. 

4.9 Product 

Unlike many functions, the product is easily implemented in a straight- 
forward fashion. A modification of the usual successive shift and add method, 
additive normalization, is suggested by DeLugish [16]. Ling [32] has given 
a method based on logical circuits to form the product of 8 bit segments of 
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numbers. Chen [4] gives another, somewhat similar algorithm, based on 

AB = (~ 2 ~) - ( 2 ) • Logan [36] has a similar proposal using "squaring" 

chips. And of course, the CORDIC algorithm will yield a product. 

In general we assume that we wish to compute yx, where 1/2 < x < 1 and 
y may be any number. 

4.9.1 CORDIC Algorithm 

Entering the CORDIC algorithm with m = 0, y = 0, in the rotation mode. 



yields y^^ ;; xz, where in this instance |z| < 1. We have y 



N +1 
o 



N N 

o o 

X(-J a ) , and I Z + ^ a | = 

i=0 ^ i=0 



'^N +ll “ “n ^ 

o o 



-N-1 



Thus, the truncation error is no more than 2 ^ ^|^|> and if |x| <1, the 

-N-1 -N-1 

bound is 2 . The roundoff error is also bounded by 2 for a total 

-N 

error of no more than 2 , or one in the last bit. 

4.9.2 *’Hand” Methods 

The usual shift and add method was outlined in Section 3.2. There is 

no truncation error, and N guard bits are required for no roundoff error 

(i.e., a double length register), but log^N 4- 1 guard bits renders roundoff 
-N-1 

error less than 2 , with chopping. 

DeLugish uses the following additive normalization. Let 



Initialize 



U = 2(U - m ) 

1 o o 



where 



P = P 4 ym , 
1 o o 



1/2 if 1/2 < X < 3/4 



m = 
o 



if 3/4 < X <1 
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For i = 1,2, ... ,N 



-1 if < -3/8 



s . = 
1 






1 if U > 3/8 
i 

0 otherwise 



then 



= 2U, - s. 
1+1 i 1 



■■i+i ' ^ • 

Letting ra^ = s^2 ^ ^ for i > 1, we have 



i-1 

= y I m. 
j=0 J 



i-1 

2^(x - 1 m. ) , and 

j=o J 



Thus the truncation error is 

N 

yx - = y(x - I m,). 



j=0 



N 



DeLugish shows that |x-J^ m| <3/8*2^, 

j=o 3 



thus the truncation error is bounded by |y|.3/8*2 ^<2^^ if |y| <1. 

Roundoff error again depends on the number of non-zero s^, which average 

about N/3 again. By an argument similar to that for division, it can be 

shown that no more than two successive s. can be non-zero, thus at most 

1 

about 2N/3 of the can be non-zero. Hence, about log22N/3 + 1 guard bits 

-N-1 

are sufficient to bound roundoff error by 2 . Total error is no more than 

one bit. 

4.9.3 Other Methods 

Several methods have been derived for obtaining the product by decomposing 
it into a sum of several other terms which are determined by means of logical 
circuits. Ling gave a method whereby the product of two 8 bit numbers is 
obtained in three additions. For longer word lengths, the idea is applied to 
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segments of the numbers in parallel, and then the product found by summing. 
Time is about 1 + 2k addition times, where the word length is 8 . , 

k > 1. 

Basically, the idea is to form 
= I = 2*^i , 1/2 < i < 1 

^ = J = 2"j , 1/2 . j < 1 . 

2 

Then, with S(z) = z - , S(i) and S(j) are formed by means of logical 

circuits. Then it is easily verified that 

xy = 2 {2’^I - 2®J - 2^"s(i) + 2^“s(j)}. 

As many as 26 terms involving seven variables occur in the definition of 
the bits of S (z ) . 

Chen uses the idea that xy = , and a decomposition 

of the square of a number into the sum of three quantities (for 8 bits; 

2 

two quantities for 4 or 6 bits), Z = R + S + T. 

The expressions for R, S, and T are decided upon by considering the 

"squaring parallelogram" that results when forming the square with pencil 

and paper. Symmetry can be exploited, and the decomposition given by Chen 

results in each bit involving no more than 14 terms and six variables. 

Logan [36] has a similar idea using squaring chips [35], except that 
2 2 2 

he forms 2xy = (x+y) - x - y . This would appear to be more costly in 

terms of both time and hardware than using xy = • 

The above procedures form a double length product, thus there is no 
error involved. 
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5.0 Conclusion 



Several conclusions can be drawn from this study, depending on the 
point of view taken. If simplicity and generality is the main objective, 
then the CORDIC algorithm of Walther [66] should be heavily favored. If 
speed is of principal importance, then the normalization method proposed 
by Chen [5] is excellent for its purpose. We note that Chen’s algorithm is 
the subject of a U. S. patent. For implementation with more conventional 
hardware, and a good compromise between simplicity and speed, the set of 
normalization algorithms proposed by DeLugish must be highly rated. 

The special purpose algorithms usually (although not always) are 
faster than those which are part of a unified algorithm. The minimal 
representation algorithms by Metze [41], [42] should be excellent for their 
purpose. The decomposition schemes for performing multiplication, proposed 
by Ling [32], Chen [4], and Logan [36], appear to be capable of high speeds. 

As was noted earlier, the only method available for computation of 
trigonometric functions is the CORDIC algorithm and variations of it. It 
appears that this is one area where some additional study is necessary to 
develop faster and more efficient algorithms. 
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Appendix A: Block Diagrams 
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PLOCK Pl.-vCiej'^f - CC:iDIC /■•LGORITHJ^ 
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BLOCK DIAGRAM - PSEUDO-MULTIPLICATION 



product^ f square 



Register Contents 

X pseudo-multiplicand, 

pseudo-product , 2 , 
multiplicand modifier 
X 

index, i 

index modifier,-^! 




shifter 
M <^c(M)2 



-i 



5 

M >e- 


i 

= (Z) 


i 




shifter 
M c(M)2“^^ 



£ 



I 



i 

M<- - 


i 

c(C) 


i 


i 


shifter 
M 4- c(M)2‘^'^^ 



hi+1 



I 



adder 




adder 


Z 4- c(Z)+c(X) 




X4- c{X)+c(M) 




M c ( C) 



scm ai fi^ fctn 

otherwise 



p(II) 



shifter 
Z 4- c(Z 






\ 


r 


adder 

I 4- c(I)+c(II) 



shifter 
M^ c(M)2 



-i-1 



,-i-l 



adder 
X ^ c(X)+c(M) 



fO and^^^ ^^ofT) ^ =0 or =n 
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BLOCK Jln'JRM - PSEUDO-DIVISION 




3 

M ^ - 


: 

c(B) 


3 


t 


shifter 
M <- c(M)2"^ 



shifter 
M-6- c(M)2 



• 2i 



adder 




shifter 




c(A) - c(B) 




Q ^ 2c(Q) 





Contents 

pseudo-dividend, z. 
pseudo-divisor, x. 
pseudo-quotient, 
dirisor modifier 
index, i 

X 



3 

M-6-c( 


i 

C) 






shifte 

c( 





'i+1 





^■i^<c(A) 


}i 2 


i 3 




jg n 




shifter 
A — 2c (A) 





M*- c 


(C) 


i 


1 


shifter 
M -»-c(M)2'^’^ 


3 


t 


adder 

B c(B)-c(m: 



d-1 
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BLOCK DIAGI?A^i - NOR14 ALIZaTION METKODC 
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BLOCK DIAGRAl-I - DE LUGISH MULTIPLICATION 
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BLOCK DIAGRAM - METZE DIVISION 
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BLOCK DIAGRAM - METZE SQUARE ROOT 
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Appendix B 



CORDIC Simulation Program 
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ooo ooo 



c 

c 

c 

c 

c 

c 

c 

c 

c 

c 

c 

c 

c 

c 

c 

c 

c 

c 

c 

c 

c 

c 

c 

c 



PRCGRAI'^ C 0 R D I C 

THIS FRGGRAN SIMLLATES THE GENERALIZED CORCIC ALGORITHM OF ^lALTHER 

PCM VALUES ARE CCHPUTED WITH ROUNDING TO NPB + NPG BITS ACCURACY 

IN THE ITERATIONS, CHOPPING IS USED AFTER MULT IPLI CAT ICNS . THIS 
CCPPESFCNCS TC A RIGHT SHIFT OFF THE END 



INPUTS ARE AS FOLLOWS 



M 

MODE 

NPE 

NGt) 



IC 



X 

Y 

Z 



IS M 

IS 1 FCR ROTATION, 2 FCR VECTORING 
IS THE NUMBER OF PRECISICN BITS OF THE COMPUTER 
IS THE NUMBER OF GUARD BITS WITH WHICH THE 
CALCULATIONS ARE PERFORMED (NOTE THAT WITH 32 BIT 
ARITHMETIC THE VALUE CF NPB+NGB MUST EE NO MORE THAN 
31 J 

OUTPUT IS GIVEN ON THE ITERATION WHEN I/IO IS AN 
INTEGER 

THE INPUT VALUE OF X ( SHCULD BE LESS THAN ONE) 

THE INPUT VALUE OF Y (SHOULD BE LESS THAN ONE) 

THE INPUT VALUE OF Z (IS IN CEGREES IF M = 1, ANC 
SHOULD BE 180 OR LESS IN MAGNITUDE. IF M IS -1 
CR 0 Z SHOULD BE LESS THAN ONE IN MAGNITUDE) 



CIMENSICN RCM(2,5C) ,NXI(32),NYI(32) ,NZI(32) 

INTEGER RCM,SFP,SFG,TSF,ZI ,XI,VI,XTEMP,0ELI,M,M0DE,NP6,D1, SZ,K 
1 , REPEAT, XCP, YCR ,ZCR,HSFG 
REALMS D,CATANH,RK,X,Y,Z,PI ,ANG,XC, YC, ZC 
CATANH(X) = .5D0*CLCG((1.D0 + X)/(1.CC - X)) 

PI = DATAN( l.DO) =»A.CC 
100 READ 1 ,M,MOC£,NPE ,NGE, 10 
IF (NPB.LE.O)STOF 
REPEAT = A 
SFP = 2’!‘*NPE 
SFG = 2**NGE 
HSFG = SFG/2 
TSF = SFP * SFG 

INITIALIZE RCM,CCMPLTE SCALE FACTOR 

RK = l.DO 
Cl = 1 

IF(M.LE.0)C1 = 2 
I = C 

J = C 

1^0 I = I + 1 

J = J + 1 

C = l.CO/Cl 
PCM(2,I) = TSF/Cl 
IF(M)lAC,15C,lfcC 

14C RCM(1,I) = CATANHIC )*TSF +.5D0 
GC TC 180 

15C RCM( 1,1) = TSF/Dl 
GC TC 180 

160 R0M(1,I) = CATAN(C)«TSF + .SCO 
180 RK = RK^d.CO + M*D**2) 

190 IF (M) 195,2CC,20C 
19S IF ( J.NE.REPEAT)GC TC 200 
1=1 + 1 

REPEAT = 3*REP£AT + 1 
RCM(1,1) = RCM(1,I-1) 

R0M(2,I) = R0M(2,I-1) 

RK = RK^d.CO + M=»D**2) 

^00 IF(D1.GT.SFP)G0 TC 2C1 
Cl = 2=»C1 
GC TC 120 
201 NMAX = I 

IF (NGE .LE.O )NMAX = NMAX - 1 
K = CSCRT(RK)*TSF + .SCO 
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PE/!D 2 »X, Y , Z 

PRINT 5,M,MCCE,NPE tNGB,X,Y,Z 

SCALE INPUTS TC INTEGER (N'PE + NGB BITS) AND DO FIRST 90 DEGREE 
RCTATICN 



IF(N.GT.O)GC TC 205 
ZI = Z’f'SFP 4 DSIGNI .5CC,Z) 

XI = X’i'SFP/K’S'TSF 4 DSIGNI .5D0,X) 

VI = V*SFF/K«TSF 4 DS IGN ( .500 , Y ) 

GO TC 250 

205 ANG = Z/18.C1=>PI 

GC TC ( 210,220 ,PCCE 
210 SZ = DSIGNI l.DCtZ) 

GC TC 2B0 

220 SZ = -DSIGNIl.DO.Y) 

230 IF ISZ .EQ.O) SZ = I 

ZI = IZ - SZ*90)/1.8D2*PI=4‘SFP 4 DSIGNI. 5D0,Z - SZ*90) 
XI = -SZ + Y^SFP/K’^TSF 4 D S I GN I . 5D0 ,-SZ=!= Y ) 

YI = SZ’^X*SFP/K*TSF 4 DS I GN I . 5D 0 , SZ «X ) 

250 ZI = ZI*SFG 
XI = XI*SFG 
YI = YI=i<SFG 



NOVv START THE RCTATICNS 

DC 4CC I=1,NRAX 
J = I - 1 

IFIMCDIJ,IC).NE.O)GC TC 280 
CALL BINARY IXI ,NXI ,32) 

CALL BINARYIYI,NYI,32) 

CALL BINARY IZI,NZI,32) 

PRINT 3, J,NXI ,NYI ,NZI ,XI,YI ,ZI 
280 CCNTiNLE 

GC TC 1300,210) ,NCDE 
200 SZ = I SIGNI 1, ZI ) 

GC TC 320 

210 SZ = -ISIGNI1,XI)=«ISIGNI1,YI) 

220 IFI SZ .EQ.O) SZ = 1 
DELI = -SZ*PCNI2, I) 

XTENP = XI 
IDLCSF = TSF/DELI 
XI = XI 4 R=»YI/ICLCSF 
VI = YI ' XTERP/ICLCSF 
ZI = ZI - SZ*RCMI 1, I ) 

400 CONTINUE 

CALL BINARY IXI ,NXI, 32) 

CALL EINARVIVI,NV 1,22) 

CALL EINARVIZ I,NZI,32) 

PRINT 2,NNAX,NXI ,NYI ,NZI ,XI , YI ,ZI 

SURMARY CF RESULTS 

LPTG = MODE 4 2*IR 4 1) 





GC 


TC 1410,420,430 


,440,450,460) 


-^10 


xc 


= X*DCCShIZ) 4 


Y^CSINHI Z) 




VC 


= V*CCCSFIZ) 4 


X*CSINHI Z) 




zc 


= C.DO 






GC 


TC 470 




420 


xc 


= DSQRTIX^^Z - 






VC 


= C.DO 






zc 


= Z 4 DATANHIY/X) 




GC 


TC 470 




430 


XC 


= X 






VC 


= V 4 X={=Z 






zc 


= O.DO 






GC 


TC 470 




440 


XC 


= X 






YC 


= O.DO 






ZC 


= Z 4 v/x 






GC 


TC 470 




450 


XC 


= X^CCCSIANG) - 


Y=tCSINIANG) 
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YC 




Y=^CCCS(ANG) ■* X=»DSIN(ANG) 




2C 




C.DO 






GO 


TC 


470 




460 


XC 


= 


CSGRT(X**2 4 Y*<2) 




YC 


= 


C.DO 






20 




ANG + 


CATAN2 ( V ,X) 


470 


XI 


= 


(XI + 


ISIGN (FSFG ,XI ) )/SFG 




YI 


= 


( YI + 


ISIGN (hSFG ,VI ) ) /SFG 




21 


= 


(21 + 


ISIGN(hSFG,ZI ))/SFG 



VCR = VC^SFF 
2CF = 2C«SFF 
PRINT 4, S'" 

GC TC 100 
FCRN/^T (1015) 
FORMAT (8010 .0) 

cr, CMAT/T-a Clliv 



*SFF + DSIGN( .5D0» VC) 
«SFF + CS IGN ( .500» 20 
SFP ,XI »>CR ,V1 ,VCR,2I,2CR 
0 



2 FORMAT (8010 .0) 

3 FORMAT (13,3 (IX, 32A1) ,3110) 

4 FORMAT (//'OSOMMARY OF RESULTS COMPUTEC CORRECT'/ 

1 'COENGMINATOR' ,1 12,3X,'X» ,2 1 11/27X , • Y • ,2 1 11 /27X , • 2 ' ,2111// 

2 16 (• *♦=♦•), • RF • ) 

i FORMAT(///'CM,MCCE,NFE,NGB' ,4I5/'0X,Y,2' , 1P3015.6/ *0 I', 

1 27X,'XI',31X, 'YI',31X,'ZI',1CX, 'XI' ,8X,*YI' ,8X,'ZI'//) 

ENC 



SUEROUTINE El NARY ( M , NO, NO ) 

0 

0 This ROUTINE CONVERTS THE NUMBER NI TC SIGNED BINARY (BCD) 

0 THE CHARACTERS ARE STORED IN THE ARRAY NI 

0 THE LEAST SIGNIFICANT CIGIT IS STORED IN NKND) 

0 IT IS ASSUMED THAT M IS NO BIGGER THAN 2**(ND-^) - 1 IN MAGNITUDE 

C 

DATA ISP,ISN,ISB,IS1,IS0 /lK+,iF-,lh ,1F1,1H0/ 

DINENSICN NC(1) 

DC ICC 1=1, ND 
IOC NC(I) = ISB 
IS = ISP 

IF (M .LT.O ) IS = ISM 
N = lABS(NI) 

CO 200 1=1, ND 
M = N/2 

K = ND - I + 1 
NC(K) = ISO 

IF (N.NE.2*M)NC(K) = ISl 
IF(M.NE.O)GC TC 2C0 
IF (K .GT .1 )NC( K-1 ) = IS 
RETURN 
20C N = M 
RETURN 
ENC 
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